A study of HMM-based automatic segmentations for Thai continuous speech recognition system

Authors

  • Pongthai Tarsaku
  • Supphanat Kanokphara
Abstract

Speech segmentation is widely used in many speech applications. In speech synthesis, the quality of the produced speech depends on the accuracy of the labeled acoustic inventory. In speech recognition, utterances segmented according to the labels are usually used as a starting point for training speech models. Segmentation is often done manually, which is a time-consuming process with low precision and consistency in some parts of speech. Therefore, a technique for automatic speech segmentation using HMMs is proposed. In this framework, the effects of manual and automatic segmentation are compared through the output of the final application, namely the word accuracy of speech recognition. In the experiment, manual segmentation was only 0.59% better than mono-phone automatic segmentation. This result suggests that tri-phone automatic segmentation will give a better result in future work.
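The abstract does not define how word accuracy is computed. A common definition, used for example by HTK's HResults tool, is (N - S - D - I) / N, where N is the number of reference words and S, D, and I count substituted, deleted, and inserted words. The following is a minimal sketch under that assumption, not the authors' evaluation code. The function name `word_accuracy` and the unit edit costs are illustrative choices.

```python
def word_accuracy(reference, hypothesis):
    """Return (N - S - D - I) / N for two lists of words.

    With unit costs, S + D + I of the best alignment equals the
    word-level Levenshtein distance, so that distance is used here.
    This is an assumed definition; the paper's exact scoring is not given.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word")

    # dist[i][j] = minimum edits turning reference[:i] into hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all j hypothesis words

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = reference[i - 1] != hypothesis[j - 1]
            dist[i][j] = min(
                dist[i - 1][j - 1] + mismatch,  # match or substitution
                dist[i - 1][j] + 1,             # deletion
                dist[i][j - 1] + 1,             # insertion
            )

    return (n - dist[n][m]) / n
```

Under this definition, two recognizers trained from manual and automatic segmentations can be compared by scoring both against the same reference transcripts. Because insertions are penalized, the value can fall below zero for very poor hypotheses.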

Similar resources

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling

The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...

Automatic pronunciation scoring for language instruction

This work is part of an effort aimed at developing computer-based systems for language instruction; we address the task of grading the pronunciation quality of the speech of a student of a foreign language. The automatic grading system uses SRI's Decipher™ continuous speech recognition system to generate phonetic segmentations. Based on these segmentations and probabilistic models we produce pr...

HMM Based Speech Recognition of Continuous Thai Digits

Progress on speech recognition of Thai digit strings is presented in this paper. HTK 3.0 was chosen to implement the HMM-based speech recognizer. MFCCs and their delta and delta-delta terms were used as speech features. Several sets of HMM parameters were investigated. Two kinds of word searching methods were tried. Recognition accuracy of 98.7% on test data was achieved with a fixed length word...

Continuous Speech Recognition Using Segmental Neural Nets

We present the concept of a "Segmental Neural Net" (SNN) for phonetic modeling in continuous speech recognition. The SNN takes as input all the frames of a phonetic segment and gives as output an estimate of the probability of each of the phonemes, given the input segment. By taking into account all the frames of a phonetic segment simultaneously, the SNN overcomes the well-known conditional-ind...

Publication date: 2002